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ABSTRACT 

We propose a computational approach to modeling the collective dynamics of populations of 
coupled heterogeneous biological oscillators. In contrast to Monte Carlo simulation, this ap- 
proach utilizes generalized Polynomial Chaos (gPC) to represent random properties of the pop- 
ulation, thus reducing the dynamics of ensembles of oscillators to dynamics of their (typically 
significantly fewer) representative gPC coefficients. Equation-Free (EF) methods are employed 
to efficiently evolve these gPC coefficients in time and compute their coarse-grained stationary 
state and/or limit cycle solutions, circumventing the derivation of explicit, closed-form evolu- 
tion equations. Ensemble realizations of the oscillators and their statistics can be readily recon- 
structed from these gPC coefficients. We apply this methodology to the synchronization of yeast 

*To whom correspondence should be addressed: yannisSPrinceton . edu; 
+1-609-258-2818 



1 



glycolytic oscillators coupled by the membrane exchange of an intracellular metabolite. The het- 
erogeneity consists of a single random parameter, which accounts for glucose influx into a cell, 
with a Gaussian distribution over the population. Coarse projective integration is used to acceler- 
ate the evolution of the population statistics in time. Coarse fixed-point algorithms in conjunction 
with a Poincare return map are used to compute oscillatory solutions for the cell population and 
to quantify their stability. 



INTRODUCTION 

Autonomous oscillations are observed in biological systems ranging in complexity from microor- 
ganisms to human beings y^. Often these oscillations are generated at the cellular level through 
positive feedback loops embedded in gene regulatory or metabolic networks Robust strate- 
gies have evolved, based on intracellular coupling of multiple oscillators (rather than on a single 
cellular oscillator subject to degradation and failure). A fundamental feature of such multicellular 
systems is the synchronization of individual oscillators to produce a coherent overall rhythm a. 
Synchronized oscillators are responsible for rhythm generation at second (heartbeat generation), 
daily (circadian timekeeping) and monthly (menstrual cycles) time scales. Malfunctioning of 
these interconnected oscillators can produce disastrous consequences such as sudden heart at- 
tacks and epileptic seizures. 

The development of mathematical models to study the synchronization of coupled oscillators 
has a long and beautiful history 410. Seminal contributions to the fundamental understanding 
of synchronization have been made by rigorous mathematical analysis of simple model sys- 
tems fl H). The study of mechanistic models of multicellular biological systems typically in- 
volves computational approaches such as dynamic simulation and numerical bifurcation analy- 
sis. Difficulties in applying such scientific computing tools are largely determined by population 
model complexity, which in turn is determined by the complexity of the individual cell model and 
the number of cells included in the population. Certain applications require both a detailed single 
cell model and a large ensemble of single cells for meaningful computational study. For instance, 
a single mammalian circadian oscillator contains multiple interconnected feedforward and feed- 
back loops that have been modeled with up to 73 coupled differential equations Meanwhile, 
the mammalian circadian system is comprised of approximately 10,000 individual oscillators that 
communicate via neurotransmitter mediated coupling (7). The development of stochastic simu- 
lations to generate meaningful population statistics is possible only if the model ensemble size 
is sufficiently large. Therefore, there is considerable motivation to develop efficient simulation 
and bifurcation analysis techniques for large, heterogeneous ensembles of coupled, complex bio- 
logical oscillators. Previous studies of heterogeneity used idealized phase models (a single 
"phase" equation for each oscillator); in this work, we show how the approach can be applied to 
ensembles of realistic limit cycle oscillator models. 

Among traditional techniques employed in the study of ensemble statistics of stochastic sys- 
tems, the most popular one is Monte Carlo simulation (llOl) (or the cell-ensemble method in 
the biological context). This approach, however, becomes extremely time-consuming when the 
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number of simulation realizations is large, as in the case of multicellular coupled oscillator pop- 
ulations. As an alternative, the stochastic Galerkin method for uncertainty quantification (UQ) 
has been widely used in recent years for solving stochastic ODEs or PDEs. Pioneering work 
along these lines (Il2h studied stochastic systems with Gaussian random variables: by viewing 
randomness as an additional dimension, beyond the space and time dimensions, the dependence 
of system responses on random parameters is represented in terms of orthogonal polynomial ex- 
pansions of random variables. In this context (see Appendix A) all orthogonal polynomials of a 
given order are called Polynomial Chaoses or Homogeneous Chaoses of that order; projections of 
the system responses onto the Polynomial Chaos (PC) coefficients evolve deterministically, and 
equations for their evolution can in principle be obtained, and then solved, by applying a Galerkin 
projection. The method has been applied for uncertainty quantification purposes in various physi- 
cal and engineering systems including structures with random properties (Il2h . porous media (Il3r) . 
fluid dynamics (14) and chemical reactions (15). In Jl^ . the method was extended: generalized 
Polynomial Chaos (gPC), applicable to a variety of continuous and discrete probability measures, 
was proposed based on the Askey scheme. In addition to representations of orthogonal polyno- 
mials, this method was also improved in other directions, in the form of piecewise (h-refinement) 
representations (Il7l) and wavelet expansions dlSl) . 

One advantage of the stochastic Galerkin method, as compared to direct Monte Carlo simu- 
lation, is that it can reduce a stochastic system to a deterministic one with (often significantly) 
fewer degrees of freedom, thus accelerating computation and saving data storage space. In order 
to apply the stochastic Galerkin method, however, one must derive equations for the tem pora l 
evolution of gPC coefficients either explicitly or through a pseudospectral approach (e.g. (1191) ') 
and develop a new code for the solution of these equations. To circumvent this additional effort, 
Equation-Free (EE) methods ( 2^ 21 , 22 ) have been utilized recently to quantify propagation of 
uncertainty in a stochastic system by evolving gPC coefficients of random solutions using the 
system dynamic simulator in a nonintrusive way, that is, without deriving the corresponding ex- 
plicit gPC evolution equations (1231) . Within this multiscale equation-free framework, the original 
stochastic dynamics code is viewed as a fine-level simulator, while the (unavailable) ODEs for 
the time-evolution of the gPC coefficients are viewed as a coarse-grained system model. These 
equation-free algorithms are built based on protocols that enable communication between differ- 
ent levels of system description; the lifting protocol translates coarse-grained initial conditions to 
one or more consistent fine scale initial conditions; the restriction protocol computes the coarse- 
grained description (values of the coarse variables, "observables") of fine scale system config- 
urations. The success of this class of methods relies on the assumption that closed evolution 
equations for the dominant (low-order) gPC coefficients exist in principle, even if they are not 
explicitly available. 

The purpose of this paper is to demonstrate that efficient simulation strategies for large pop- 
ulations of coupled biological oscillators can be developed by utilizing these equation-free un- 
certainty quantification (EF-UO) based methods. A six-dimensional cellular model of yeast gly- 
colytic oscillations (1241 l25l) is used to study synchronization of 1,000 heterogeneous cells in a 
well mixed environment. The random variable (which characterizes the cell population hetero- 
geneity) is chosen as the glucose influx for each cell. In the context of EF-UQ, we demonstrate 
coarse projective integration, which accelerates temporal simulation of the cell population dy- 
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namics, and coarse fixed-point computation combined with Poincare return maps to efficiently 
converge on limit cycle solutions, corresponding to synchronous population oscillations. Limits 
of the applicability of the procedure are discussed, including an extension of the basic methodol- 
ogy to handle loss of synchronization when isolated outlier cells "detach" from the main coherent 
population and develop individual oscillatory characteristics. 



A MECHANISTIC SYNCHRONIZATION MODEL FOR YEAST 
GLYCOLYTIC OSCILLATIONS 

The yeast Saccharomyces cerevisiae exhibits autonomous oscillations with a period of approxi- 
mately one minute when grown under anaerobic conditions Similar oscillations have 
been observed in other yeast strains (2^ S^) as well as algae (31), muscle (32), heart and 
tumor cells. Yeast studies suggest that an autocatalytic reaction involving the glycolytic en- 
zyme phosphofructokinase is the main cause of oscillations at the single cell level. Therefore, the 
observed limit cycle behavior has been termed glycolytic oscillations. Additional experimental 
work has focused on characterizing the intercellular mechanisms involved in the synchroniza- 
tion of individual yeast cell oscillations Jssl) . Typically, oscillations at the cell population level 
are observed by continuous monitoring of the average intracellular NADH concentration using 
fluorometry. Experiments with Saccharomyces cerevisiae grown in anaerobic stirred cuvettes 
suggest that secreted acetaldehyde is the key signaling molecule in the synchronization mecha- 
nism (ESS- 

A number of simple cell models have been developed to capture the glycolytic oscillation 
mechanism in yeast (13 8L Cell models based on more detailed descriptions of the 

glycolytic reaction pathway also have been proposed (I42L 43). Small ensembles of single cell 
models have been used to investigate the synchronization phenomenon (|38L 41 , 42 ). In this paper, 
we use a single cell model of intermediate complexity (|25[) to demonstrate our computational 
framework for simulating large populations of coupled biological oscillators. Our cell ensemble 
model is based on an intracellular coupling mechanism involving the transport of acetaldehyde 
across the cell membrane (1241 1251). 

A single cell in the population is described by the following differential equations: 
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where the index i denotes the cell. The pathway model accounts for glucose flux into the cell 
(Jo), metabolism of glucose to produce intracellular glycerol, ethanol and a combined acetalde- 
hyde/pyruvate pool (hereafter called acetaldehyde), acetaldehyde flux out of the cell (J), and 
degradation of extracellular acetaldehyde by cyanide. Glycolytic intermediates modeled are in- 
tracellular glucose (Si), the glyceraldehyde-3-phosphate/ dihydroxyacetonephosphate pool (S2), 
1,3-bisphosphoglycerate (S^) and intracellular acetaldehyde (6*4). Each co-metabolite pair is as- 
sumed to be conserved with A and denoting the constant concentrations of the ADP/ATP and 
NAD+/NADH pools, respectively. Therefore, only the NADH (N2) and ATP (^3) concentrations 
are treated as independent variables. The intracellular reaction rates V2-vq depend linearly on 
the metabolite and co-metabolite involved in each reaction, while the acetaldehyde degradation 
rate, denoted vj, depends linearly on the extracellular acetaldehyde concentration. Individual cell 
oscillations are attributable to the nonlinear term in the reaction rate vi that accounts for ATP 
inhibition. 

The net flux of acetaldehyde from the i-th cell into the extracellular environment is modeled 
as Ji = k(5'4 j — S^^ex) where S^^^x is the extracellular acetaldehyde concentration and k is a 
coupling parameter related to the cell permeability. A mass balance on extracellular acetaldehyde 
is derived under the assumption that the volume fraction of cells relative to the total medium 
volume ((fi) remains constant as the total number of cells M is varied: 

dS ^ 

~~dt^ = X/ "'^ ~ ^'^ ^ M ^ '^('^4,1 - S4^^ex) - kSi^ex (V) 
i=l i=l 

where k is the kinetic constant of the acetaldehyde degradation reaction. The chosen parameter 
values produce an asymptotic solution in which all cells are synchronized regardless of the 
cell number. The total number of differential equations (n) in the cell ensemble model increases 
linearly with the number of intracellular metabolites (6) and the number of cells (M): n = 
6M + 1. Unless otherwise stated, the following simulations involve 1000 cells: n = 6001. 
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APPLICATION OF EF-UQ COMPUTATION TO YEAST GLY- 
COLYTIC OSCILLATIONS 

Polynomial chaos representation of cell properties 

Our cell ensemble model of yeast population dynamics consists of a large set of coupled nonlin- 
ear ODEs. The intracellular metabolite concentrations in such problems are in general random 
variables; many different sources of randomness exist, from intrinsic kinetic fluctuations, to pop- 
ulation heterogeneity, to randomness in the initial conditions. In our particular case we consider 
the randomness arising from population heterogeneity; there is a single random parameter, the 
glucose flux Jq: Jq = Jq + aj^{uj), where ^ is normal over the sampling space Q. This simple 
choice of a normally distributed uncertainty is made for illustration/validation purposes; the pro- 
cedure is directly applicable to different distributions of uncertainty, as we will discuss below. As 
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our heterogeneous population evolves, each intracellular concentration evolves, and so therefore 
do the corresponding concentration distributions over the population. The "obvious" collective 
variables for evolving distributions are the first few moments of the distribution: mean, vari- 
ance etc.; it might, at first sight, appear that good coarse-grained observables of our population 
state would be these first few moments of the individual intracellular concentration distributions. 
These moments, however, do not take into account correlations across the distributions (8); in 
our heterogeneous population it is not enough to know how many cells have a certain intracellular 
metabolite concentration - we must also know which cells have this concentration: the cells with 
higher or with lower intrinsic values of Jo ? 

We have observed in our simulations (and the same phenomenon has been documented in 
different heterogeneous coupled oscillator contexts (|^) that, after a typical initialization, two 
distinct phases are observed in the dynamics. During an initial, relatively fast phase, strong cor- 
relations develop between the distributions of the various intracellular concentration values; this 
is followed by a second, long term phase, during which the distributions evolve, but with the 
correlations "locked in". In effect, these correlations appear to be strongly related to the hetero- 
geneity of the population - cells with different Jq exhibit systematically different concentration 
patterns. Figure [T] shows the dependence of concentrations of NADH and ATP on the population 
heterogeneity parameter (the random variable Jq) three instantaneous states. There is clearly a 
relation/dependence between the intracellular metabolite concentrations and the "identity" of the 
cell (the value of Jo). 

If the simulation is continued from the exact state where it was interrupted, these correlations 
remain in place (see the red curves in Figures |2l and |3ll. If we now create new, artificial initial 
conditions, where leading moments of the distributions of individual intracellular concentrations 
are retained but in which the correlations are deliberately scrambled then the system will quickly 
move away from the synchronized oscillation in a violent transient (see the cyan curves in Figures 
|2and|31), and will take a long time to return to it, rebuilding the correlations in the process. 
These numerical experiments suggest that strong correlations between cell heterogeneity and cell 
behavior get established during initial stages of the population response, and are then retained 
in the long term dynamics. Based on this observation, and on the functional dependence of 
intracellular concentrations on our random variable clearly apparent in Figure[Tl we will assume 
that all intracellular concentrations across the population can, in the long-term dynamics, be 
expressed as (unknown) functions of the same random variable. 

Denoting 

x{i{uj),t) = {SM^).t). sM^).t). sM^).t). sM^),t), NM^),t),AM^),t)f, 

we will represent x in terms of a truncated Polynomial Chaos expansion of 

p 

x{i,t) = Y,^i{t)^,{i). (8) 

3=0 

The fine scale state is the 6001-long vector of dependent variables (the 6000 intracellular con- 
centrations plus one extracellular one) in our detailed set of coupled ODEs. Our coarse-grained 
observables are the gPC truncation coefficients, a^^(t) = 2:2^^? ^3,c5 ^4,0 ^5,0 ^6,c)^- The 
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lifting step, the construction of ensemble realizations of intracellular concentrations (fine-level 
states) consistent with a particular set of values of gPC coefficients (coarse-level observables) is 
performed through 

p 

= X;*'cW^ite), ^ = 1,2,---,M. (9) 

j=0 

Here M(= 1000) is the total number of cells, the truncation level P is set to 3, and {^j} are 
orthonormal Hermite polynomials (1441) for the case of normally distributed ^ . We reiterate that 
different types of polynomials can be used for random variables obeying different distributions. 
The total number of our coarse model states is thus 4x6 + 1 = 25 (the deterministic extracellular 
concentration S^^ex is counted here as a single coarse observable); this is clearly much less than 
the number of internal states, 6M + 1, in the original cell dynamics. 

Obtaining the coarse grained observables from a detailed state constitutes the restriction step; 
our restriction protocol consists of the inner product, Eq. [121 which can be performed in one of 
two ways: (i) we can discretize the integral to approximate the inner product, i.e., < x, >= 
'^/Mj^jLi ^(O)^i(O)' (ii) perform a simple least squares fitting to find xi such that 

an norm | t) — ^2^=0 ^c(^)^i(0 1 1 is minimized. We used the second implementation to 
obtain gPC coefficients in this work. 

In what follows, we will demonstrate two distinct ways of using the EF-UQ methodology in 
order to accelerate population computations for our model problem. We will first demonstrate 
direct simulation acceleration through coarse projective integration. We will then demonstrate 
the accelerated computation and coarse-grained stability analysis of synchronized population os- 
cillations through matrix-free fixed point computation and eigenvalue approximation; these syn- 
chronized limit cycles will be computed as fixed points of a coarse Poincare map. By doing so, 
we demonstrate that our multiscale toolkit provides a generally applicable and computationally 
efficient framework for dynamic simulation and analysis of heterogeneous populations of cellular 
oscillators. 



Full direct simulation and coarse projective integration 

The full direct simulation consists of integrating of 6M + 1 differential equations (M is the num- 
ber of cells, 6 internal states for each cell, and 1 extracellular variable). The package ODETools 
is used in Matlab for this simulation, with variable step-size chosen for relative error tolerance 
of 1 X 10"^ and absolute error tolerance of 1 x 10"^^. Figure IH shows the time history of such 
a full simulation of an ensemble of cells over one complete period for the distribution of pa- 
rameter Jo having mean 2.3 and standard deviation 1 x 10""^. At discrete moments in time, the 
values of the full ensemble are projected to the gPC basis (the full state is restricted onto coarse 
observables), also shown in Figure |4| At the fine scale, the concentration of each intracellular 
metabolite oscillates and, at the end of an oscillation, returns exactly to its starting value. At the 
coarse, macroscopic level, it is the gPC coefficients that return to themselves. This is, in effect, 
saying the distributions of the species concentrations return to their initial values. 

We now explain the projective integration procedure and compare its results to the direct 
simulation. Starting at t = to from a given coarse initial condition (values of the first few gPC 
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coefficients, the observables), we generate fine-level model states by the lifting procedure Eq. 
|9l We use the original, full dynamic simulation code to evolve the fine-level description for 
an initial time interval theau we continue the full direct simulation over three successive time 
intervals of time 5t. At the end of each of these intervals, the coarse observables are obtained by 
restriction. The temporal derivatives of the coarse observables can then be estimated (here for 
convenience we use least squares, but maximum-likelihood based methods are more appropriate 
(1451) ): these local time derivatives are then used to project the values of the coarse variables 
at a time h5t further in the future. Using these predicted values we begin the same cycle of 
brief healing, detailed simulation observed by restriction, estimation of local time derivatives and 
projection of the coarse behavior into the future. Depending on the coarse projection algorithm 
used in the forward-in-time projection, more (or fewer) restrictions from the short burst of full 
simulation may be needed. The particular values of th^ai and 5t can also vary from those used 
here {theai = 5 x 10~^, 5t = 5 x 10"'^), and their on-line optimal selection is problem-dependent. 

Figure 13 demonstrates the acceleration of the full simulation through a coarse projective for- 
ward Euler algorithm. Note that for every A5t of full direct simulation, we project h5t into the 
future - only 44.44% of the work is necessary this way. This type of computational savings 
enables the simulation of larger cell ensembles, allowing more accurate reconstructions of pop- 
ulation statistics. As discussed in the method can provide significantly larger savings if 
a separation of time scales exists in the problem; for linear problems this can be readily seen 
as a gap between a few leading, slow, and the remaining many, fast eigenvalues of the system. 
Different coarse projective algorithms (e.g. projective Runge-Kutta, or even implicit projective 
algorithms) can also be used; linking them to modem estimation techniques and extending them 
to account for adaptive projective step size selection is the subject of ongoing research (see (l47r) 
as well as (i48i)). 



Coarse-grained limit cycle computations 

Limit cycle computation. Beyond direct simulation, which asymptotically approaches stable 
limit cycles, periodic orbits are located by solving boundary value problems in time. In particular, 
they can be located as fixed points of a Poincare map (see, for example, the textbook (B). A 
Poincare Section, S, is a hypersurface (often a hyperplane) crossing a limit cycle transversely at 
an (isolated) point. The Poincare map, P : S ^ S,is a return map defined by 

P{x) = <Pt{x) e S, 

for X E S, t the smallest positive time for which (ptix) E S, and 0^ the time t flow map. Note 
that it is not necessary to know a priori the period of the limit cycle when defining the Poincare 
map. Let Pf be the Poincare map for the full simulation and Pc be the Poincare map for the 
coarse simulation. The two maps are related by 

Pc = RoPfoL 

for R, L the restricting and lifting operators and o is composition. We use the Newton-Krylov 
GMRES method to find solutions of Pc(^c) = (0. This implementation of Newton iteration 
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does not explicitly require the computation of the Jacobian of the equation to be solved (often 
computed in shooting methods through integration of the variational equations). The action of 
this Jacobian along sequentially selected directions (directional derivatives) is estimated from the 
results of simulations starting at appropriately chosen nearby initial conditions. Since this only 
requires simulations of the problem, and the linear equations are solved without ever assembling 
the relevant Jacobian, this is a matrix-free implementation. 

Note that the Poincare return map can be constructed, not only by direct simulation, but also 
by projective integration. That is, projective integration can be implemented to accelerate the 
computations of the Poincare map itself. The limit cycle in Figure |5] was found by solving this 
fixed point problem. 

Limit cycle stability. For multiscale problems with limit cycles, we can discuss stability at 
the fine-scale and also at the coarse-grained level. We describe here limit cycle stability compu- 
tations at both levels, and it is interesting that the results are essentially the same - the coarse- 
grained stability computations reveal the same information as the (computationally intensive) 
fine- scale computations. 

Let T be the period of the limit cycle, and let Xq be a point on the limit cycle. The time T 
flow map is ^^(ajo) = x{t] Xq). Eigenvalues {Floquet multipliers) of the matrix Dcpxixo) (the 
Jacobian of (px at xq, the so-called monodromy matrix) quantify the linearized stability of the 
limit cycle. The time T coarse flow map is = ^riXc), see Eq. O The matrix Z}$r(Xc) 
can be approximated by finite differences, 

' $T(X, + eei)-X, <!>T{X, + ee2)-X, 

, , . . . 

e e 

for e << 1. The eigenvalues of D$t(^c) are shown in Figure |6l and the leading ones are 
listed in Table [H Note that the number 1 is an eigenvalue (corresponding to time translational 
invariance along the limit cycle at the point uq). This invariance gives rise to a neutrally stable 
direction; all limit cycles possess this neutral eigenvalue at 1. The magnitudes of the remaining 
leading eigenvalues of the limit cycle are less than 1 (they lie inside the unit circle). Therefore, 
the limit cycle is stable. 

Eigenvalues of the fine-scale monodromy matrix reveal the stability of the full limit cycle. 
Divided differences could in principle be used to approximate the full 6001 x 6001 Jacobian of 
the fine scale problem; instead, to obtain this matrix we integrated the variational equations (see, 
for example, the textbook Let X = f{x); then the variational equations along the solution 
x(t, Xq) are 

U'{t,Xo) = Dfix{t,Xo))-U{t,xo), 

where U (t) is a (6M + 1) x (6M + 1) matrix and the initial condition U (0) is the identity matrix. 
This gives 

D(l)^j.{xo) = UiT,xo). 

The "fine" eigenvalues are shown in Figure|6l and the leading ones are in Table[T] 

It can be shown that eigenvalues of the coarse and fine- scale monodromy matrices are the- 
oretically the same (if sufficiently many gPC coefficients are kept), and the coincidence of the 
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eigenvalues is seen in the table. Analysis of the fine-scale model considered here is computation- 
ally tractable; for larger problems, however, a full analysis may not be possible, and the coarse 
stability analysis suffices to characterize stability of the fine-scale limit cycles. For instance, these 
methods could be profitably a ppl ied to large models of coupled biological oscillators involved in 
yeast respiratory oscillations (pit) and mammalian circadian rhythm generation 



When coarse-graining fails: "rogue" oscillators 

When the variance of Jo is small, NADH concentrations in the cells across the population 
are narrowly distributed; an overall synchronized solution for the entire population prevails, and 
our coarse-graining methods are successful. When the variance of Jo increases, however, the 
amplitudes of the N2 oscillations across the cell population will separate significantly. For cer- 
tain combinations of mean and variance of Jq, one or more cells will appear to oscillate "freely" 
in amplitude. An illustration of this, using a 50 cell ensemble as the full direct simulation, and 
setting the mean and standard deviation of Jq to 2.1 and 0.08, respectively is seen in Figure IT] 
The cause of this single "free, oscillating" cell can be clearly rationalized based on the relatively 
large variance of Jq: the values of Jq across the population spread beyond the parameter point at 
which the single cell dynamics undergoes a Hopf bifurcation. The strong correlation between in- 
dividual cell Jo values and their detailed states (intracellular concentrations) is retained for most 
of the cells, but lost for the outlying "rogue" oscillatory cell. As ctj continues increasing, the 
strong correlations that allowed us to use a gPC expansion for the collective behavior fail even 
more dramatically - see Figure |71 which is quite representative of dynamics in populations with 
large variance of the glucose influx. We expect that such strong discontinuities in the relation 
between heterogeneity and detailed state (reminiscent, at some level, of Gibbs phenomena) will, 
in general, be encountered in probabilistic problems involving strong nonlinearity and bifurca- 
tions of the single oscillator behavior as the heterogeneity parameter(s) is varied. Clearly, a few 
gPC coefficients are no longer good observables of the collective system state. The recent litera- 
ture includes some efforts in resolving such cases using a wavelet-based chaos expansion |li) or 
piecewise Polynomial Chaos 417.) ; yet the problem remains an open subject for future research. 

Here we will limit ourselves to the case of a single rogue oscillator, shown above. A simple 
and rational way to tackle this problem is to employ gPC coefficients to describe the oscillators 
that are "clumped together" with smaller oscillation amplitudes, and use an additional, different 
set of variables to describe intracellular metabolite concentrations for the single "freely oscillat- 
ing" cell. More specifically, for a total number M of cells which include one "free" cell, the 
coarse observables are comprised ofx^,xl,---,x^ (obtained by restricting intracellular concen- 
trations of M — 1 clumped cells through the least-square fitting method mentioned earlier), the 
extracellular concentration S^^ex and six intracellular concentrations 5*1, 5*2, 6*3, 6*4, N2, A-^ repre- 
sentative of the free cell. When lifting is now implemented, only intracellular concentrations of 
M — 1 cells are generated from a;°, a;J, ■ ■ ■ , ajf through Eq. |9l The intracellular concentrations 
Si, 5*2, S3, 5*4, A'^2, ^3 for the free cell are part of the coarse-grained description. Therefore, the 
full direct simulation is again characterized by 6M + 1 variables, but the number of observables 
of the reduced, coarse-grained problem increases to 4x6 + 1 + 6 = 31 if the first four leading 
order gPC coefficients are retained (P = 3). 
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The initial condition on coarse observables for the projective integration is found by using the 
equation-free fixed-point algorithm in conjunction with the Poincare map on the 31 -dimensional 
space. Coarse projective integration for this new, 31 -dimensional coarse grained system is il- 
lustrated in Figure |5l Our coarse initial condition is the restriction of a point on the detailed, 
fine-scale limit cycle solution; starting there, we use short bursts of full direct simulation to esti- 
mate the time derivatives of our 31 coarse observables, and then coarse projective forward Euler 
to project their values forward in time. 

The phase portrait of the coarse-grained limit cycle, projected on the leading order gPC coef- 
ficient of the concentration of NADH of the M — 1 clumped cells and the concentration of ATP 
in the free cell, is shown in Figure |5l Reasonable numerical agreement with the full direct os- 
cillation (observed on the same variables) prevails; even in the presence of one (more generally, 
of a few) free cell(s), choosing a good set of coarse-grained variables allows us to accelerate the 
computations of the long-term system dynamics. Of course, this approach requires a priori iden- 
tification of the rogue oscillating cells to construct the coarse variables. Clearly more research is 
needed towards the selection of good coarse observables for such problems. 



SUMMARY and CONCLUSIONS 

Equation-Free Uncertainty Quantification methods were used in this paper to accelerate the 
computer-aided analysis of the dynamics of heterogeneous ensembles of coupled biological os- 
cillators; in particular, coarse-grained computations of synchronized population- wide limit cycles 
and their stability was demonstrated for an ensemble of yeast glycolytic oscillators coupled by 
membrane exchange of intracellular acetaldehyde. The feasibility of the EE UQ in describing 
certain particular situations (where one -or a few- oscillator(s) move freely in amplitude, distin- 
guished from the "bulk" of the population) was also demonstrated. 

This paper contains only representative "proof of concept" computations. There is a clear 
necessity for extensive numerical analysis of the schemes illustrated, including adaptive step- 
size selection, error estimation and control. Different restriction schemes need to be devised 
when the relation between heterogeneity and behavior becomes nonsmooth or discontinuous in 



the "randomness direction(s)". Different lifting schemes (1531) exploiting a non-explicit separation 
of time scales may reduce the number of gPC coefficients required for an accurate reduced de- 
scription. Furthermore, a wealth of data-mining techniques currently under development (and, in 



particular, the diffusion map approach (1541551) ) holds the promise of extracting low-dimensional 
parameterizations of high-dimensional data based on graphs constructed on simulation data and 
the eigenfunctions of diffusion operators on these graphs. It would be interesting to explore the 
performance of such techniques when simple gPC observables fail, as in the case of the disconti- 
nuities and "multiple oscillator clumps" mentioned above. 
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APPENDIX A: SOME BASIC UNCERTAINTY QUANTIFI- 
CATION (UQ) AND EQUATION FREE UNCERTAINTY QUAN- 
TIFICATION (EF-UQ) ISSUES 

On polynomial chaos expansion of random variables and processes 

Orthogonal polynomials of random variables with an arbitrary probability measure (Gaussian, 
uniform, Poisson, binomial, ...) are called generalized Polynomial Chaos (gPC) Jl^ . Any func- 
tional vector, x{u), of random variables, ^(u;) (= {^i{uj), ■ ■ ■ , ,^„(ti;))^), defined over a probabil- 
ity space JF, P), can be expressed in terms of a gPC expansion, 

oo 

x{u)=J2K'^^Ui^))^ (10) 

i=0 

where '^i{^{uj)) is the ith generalized Polynomial Chaos which admits the following orthogonal- 
ity properties, 

, , f < >, if i = i, 

<^"^^>=|o, if iA ^''^ 

and xl is the corresponding coefficient of ^^(^(a;)), determined by 



< x{ u),^,{^{u )) > 

< > 



In the above equations, the inner product < •, ■ > is defined by 

< f{^),9{^) >= f{i)9{i)vmi, ^ = (ei(^), ■ ■ ■ , u^)Y. (i3) 

where p{$,) is the joint probability measure of ^ and V the support of p(^). 

In the case that a? is a random field or process having the form x{ijj , s) ox x{uj , t) where s and 
t are, respectively, spatial and time coordinates, the projections of x onto the Polynomial Chaos, 
x\, must admit a form that depends on these spatial or time coordinates as well. In our illustrative 
example the gPC coefficients evolve in (and thus also depend on) time. 



On the stochastic Galerkin method 

The stochastic Galerkin method aims at quantifying propagation of uncertainty in dynamical 
systems. In this method, solutions of stochastic systems are first expressed in terms of a finite 
linear combination of generalized Polynomial Chaos. The error resulting from the finite-term 
expansion is then required to be orthogonal to test functions, which are normally chosen to be 
the same as the generalized Polynomial Chaos. A coupled system of equations for the gPC 
coefficients can thus be derived and solved Probability distributions and statistical moments 
of the solutions can be computed from the gPC coefficients subsequently. 
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In particular, let the states of a stochastic system be represented by a vector x{uj,t) : Q x 
M — s> M^, where u is an element in the sampling space fi. This system state is governed by the 
differential equation 

dx 

— = f(x,^iuj)), x{iu,0) = Xo{iu). (14) 
The solution of the above equation can be approximated by a finite expansion in the form of Eq. 

M 

p 

x{u;,t) = J2<m^i^i^))■ (15) 

By applying a Galerkin projection, equations governing the gPC coefficients x^t) are obtained 
as 



1 P 

< /(E =^cW*.(^)), >, ^ = 0,1,■■■,P, 



dxl 
dt 

i=0 



(16) 



with xl.{0) = '^^°q,f^^'^}^ ■ The above equation can be rewritten as 

<>'> 

Here Xc = {x^, xl, ■ ■ ■ , x^)'^ and H = {ho, hi, ■ ■ ■ , hp)'^, where 

^'^^'^ = < ^c(^)*.(0), >, ^ = 0,1,■■■,P. 

If = in the long-time limit, then Eq. \1T\ has a steady state, which can then be used to 
recover the probability distribution of the random steady state of Eq. [TU The stochastic Galerkin 
method can provide an effective reduction of a stochastic model if a relatively small truncation 
in Eq. \T5\ above is sufficiently accurate. The corresponding deterministic model will then be 
considerably easier to simulate and analyze than large numbers of realizations in a Monte Carlo 
simulation of the original dynamics. 



On Equation-Free methods and their application in uncertainty quantifica- 
tion 

The basic building block of equation-free methods ( 2^ 21 , 22 ) is the coarse time-stepper, which 
consists essentially of three components: lifting, micro -simulation, and restriction. Lifting is a 
procedure to transform a coarse-level state to its fine-level counterpart and restriction the converse 
of lifting. By employing the model states x{t) in Eq. [l4las the fine-level system state vector, 
and their low-order gPC coefficients Xc as the coarse-level observables, Equation-Free methods 
can be used to numerically study the behavior of gPC coefficients without needing closed form 
ODEs for their evolution, Eq. [T7] (Hsl)- The lifting and restriction protocols in our context are 



13 



then Eq. [TSland Eq. [121 respectively. The micro-simulation is just the simulation of the original 
large coupled ODE system Eq. O Assuming that the long-term coarse-grained dynamics in 
gPC space lie on a low-dimensional, attracting slow manifold, one can use coarse projective 
integration to accelerate the computation of successive coarse-level system states, i.e., low-order 
gPC coefficients xl.{tj),i = 0, 1, ■ ■ ■ , P, j = 0, 1, ■ ■ ■ , M, through restriction of (one or more) 
short bursts of fine scale simulation as follows: Temporal derivatives ^ at t]\j are estimated by 
least-squares fitting of the short-time gPC coefficient evolution xl{tj),j = M — k, M — k + 
1, ■ ■ ■ , M, k < M. These gPC coefficients in conjunction with their locally estimated temporal 
derivatives are then used to extrapolate (in effect, integrate the gPC coefficients numerically in 
time) over a relatively large coarse time interval T. For instance, if the coarse forward Euler 
projective integration is used, then the predicted gPC coefficients at a later time tM + T are 
obtained by 

xl{tM + T)^xl{tM) + ^{tM)T, i = 0,l,---,P- (18) 

These projected in time gPC coefficients xl{tM + T) are lifted again to the full fine state level, 
and a new short burst of micro- simulation is initiated. The procedure is repeated until a desired 
time limit is reached. Issues of time-step selection, estimation and error control are important, 
and often "the devil lies in these details"; discussing these numerical analysis issues, however, is 
not the aim of this brief exposition, and we refer the reader to (21, 4^ 47, 5t 



If equation Eq. [T7]possesses a steady state X^, then X* must satisfy an integral form of Eq. 
fT71 given by 

X-^ = X-^+ / H{X,)dt (19) 

where Xc(to) = X^. The right-hand side of Eq. [191 can be viewed as a time flow <I>t : 
]^7Vx{i+p) X M ^ ]RiVx(i+p) Therefore, the steady state X^ is the fixed-point of the equation 

X, = <fT(X,). (20) 

When the above equation is not explicitly available, we can use the coarse time-stepper to ap- 
proximate the time flow <I>t- Newton's method or other iterative algorithms, often in matrix-free 
implementations, can be readily employed to compute steady-state gPC coefficients, which can 
be used to reconstruct random stable/unstable steady states of the original system, Eq. [I4l It is 
also possible to compute limit cycles of the gPC coefficients through Poincare return maps, thus 
linking EE methods with the analysis on random limit cycles. 
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Leading eigenvalues 

coarse eigenvalues full eigenvalues 



1.000004 
0.942622 
0.496275 
0.328651 
0.100267 ±0.068410z 
0.126292 



1.000219 
0.9426798 
0.4962654 
0.3273728 
0.100250 ±0.0683621^ 
0.126075 



Table 1 : Selected, leading eigenvalues from the computations of the coarse level and of the fine 
level are shown. The eigenvalue at 1 corresponds to a neutrally stable direction along the limit 
cycle. 
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Figure 1 : (color online) One period of the oscillation of [NADH] for an ensemble of yeast cells 
is shown (blue, left axis); one oscillation of the leading order gPC coefficient [ATP] is shown 
(green, right axis). The relationship of two intracellular concentrations, [NADH] (blue) and 
[ATP] (green), with respect to the heterogeneity of the glucose influx (Jq) is shown in the inset 
figures. Note the continuous dependence of the intracellular concentration on the parameter Jq. 
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Figure 2: (color online) Phase maps for variances of [NADH] and [AA] in the initial stage. 
Blue curves: variances computed from realizations located on the fine-level limit cycle. Red 
curves: variances computed from realizations initialized with the fully correlated lifting from a 
"blue" coarse initial conditions. The blue circle indicates the location of this initial condition. 
Magenta and cyan dashed curves: variances computed from realizations initialized with two 
random liftings from the same coarse initial conditions as above. Note that blue and red curves 
almost coincide with each other. 
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Figure 3: (color online) If the correlations of metabolites within a cell are not incorporated in 
the lifting (top: cyan dots at time to), then these wrong correlations remain (middle, ti) until a 
sufficiently long amount of time elapses (bottom, ti + 100). The blue line shows the mature, 
or natural, correlation. The red dots have been correctly initialized; the cyan points have been 
initialized with wrong initial correlations. For consistent computations, the relationships between 
the different metabolites in a cell must be replicated when an ensemble of cells is constructed (in 
the lifting step). 
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Figure 4: Top: The time series of [NADH] in cells of the ensemble are plotted against time t and 
the random parameter Jq (glucose influx) over one oscillation. The lines transverse to the time 
series curve are equal-time curves. Bottom left: Time series of the leading order gPC coefficients 
of [NADH] Bottom right: A projection of the coarse limit cycle onto the leading order [NADH] 
and [ATP] gPC coefficient plane. The distribution of the glucose influx parameter Jq has mean 
2.3 and standard deviation 0.001. 



23 




O.OB 0.1 0-12 0.14 0.16 0.18 02 0,22 
([MADH] (mM) 




0,162 0.1622 0.1624 0.1626 0,1628 0.163 0,1632 
feading order gPC coeff of INADH] (mM) 

Figure 5: Projective integration in coarse variables (blue) and natural evolution (black). Projec- 
tive integration accelerates computation of the evolution of a cell population. Top: An ensemble 
of size M = 1000 with the distribution of the glucose influx parameter having mean 2.3 and stan- 
dard deviation 0.001. Bottom: An ensemble of 50 cells, with the distribution of the glucose influx 
parameters having mean 2.1 and standard deviation 0.08. In this regime, there is one free cell, 
which oscillates at an amplitude relatively larger than the others. For each cell population sub- 
plot, the horizontal axis is the leading order gPC coefficient for [NADH]. Top: the vertical axis 
is the leading order gPC coefficent for [ATP]. Bottom: the vertical axis is [ATP] of the "rogue" 
ceU. 
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Figure 6: (color online) Leading eigenvalues of D(j)i (blue, x) and D(j)t (red, o). The leading 
eigenvalues are essentially the same, as expected, the unit circle is shown (black curve); note the 
eigenvalues at 1, and all other eigenvalues are inside the unit circle, analyzing stability of the 
coarse problem gives insight into the stability of the full cell ensemble. 
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Figure 7: (color online) The discontinuity in the relationship between [NADH] and the parameter 
Jo is shown for distributions with "rogue" oscillators. Top: Jo has a Gaussian distribution with 
mean and standard deviation being 2.1 and 0.08, respectively; bottom: mean 2.1 and standard 
deviation 0.3. The figures on the left show the time history of [NADH] for all 50 cells. The 
figures on the right show [NADH] as a function of Jo at a snapshot in time (t = 45). Equation- 
free computations can be applied in these regimes by using more and different coarse variables: 
gPC coefficients for the "bulk" of the oscillating cells and either gPC coefficients for the other 
cells or using all the state variables of the other cells. 
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